Skip to content

Add instruction execution limit to prevent infinite loops #18679

Merged
lucylq merged 1 commit intomainfrom
security23
Apr 21, 2026
Merged

Add instruction execution limit to prevent infinite loops #18679
lucylq merged 1 commit intomainfrom
security23

Conversation

@lucylq
Copy link
Copy Markdown
Contributor

@lucylq lucylq commented Apr 2, 2026

JumpFalseCall instructions can set destination_instruction to themselves, creating an infinite loop that hangs the runtime. This adds a configurable instruction counter (default 10M, overridable via -DET_MAX_INSTRUCTIONS) to Method::execute() that returns Error::InvalidState if exceeded.

This PR was authored with the assistance of Claude.

Test plan

Existing tests

cmake -B build -DEXECUTORCH_BUILD_TESTS=ON
cmake --build build --target method_test
ctest --test-dir build -R method_test --output-on-failure

@pytorch-bot
Copy link
Copy Markdown

pytorch-bot Bot commented Apr 2, 2026

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/18679

Note: Links to docs will display an error until the docs builds have been completed.

❌ 1 New Failure, 2 Unrelated Failures

As of commit 0047105 with merge base 5e8a0df (image):

NEW FAILURE - The following job has failed:

BROKEN TRUNK - The following jobs failed but were present on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@meta-cla meta-cla Bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Apr 2, 2026
@github-actions
Copy link
Copy Markdown

github-actions Bot commented Apr 2, 2026

This PR needs a release notes: label

If your change should be included in the release notes (i.e. would users of this library care about this change?), please use a label starting with release notes:. This helps us keep track and include your important work in the next release notes.

To add a label, you can comment to pytorchbot, for example
@pytorchbot label "release notes: none"

For more information, see
https://github.com/pytorch/pytorch/wiki/PyTorch-AutoLabel-Bot#why-categorize-for-release-notes-and-how-does-it-work.

@lucylq lucylq marked this pull request as ready for review April 20, 2026 22:05
@lucylq lucylq requested a review from JacobSzwejbka as a code owner April 20, 2026 22:05
Copilot AI review requested due to automatic review settings April 20, 2026 22:05
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds a safety guard in the ExecuTorch runtime to prevent Method::execute() from hanging indefinitely when executing malformed programs that can loop forever (e.g., self-looping jump instructions).

Changes:

  • Introduces a compile-time configurable maximum instruction execution count (ET_MAX_INSTRUCTIONS, default 10,000,000).
  • Tracks executed instruction count inside Method::execute() and aborts with an error when the limit is exceeded.
  • Emits an error log when the limit is hit and resets execution state before returning.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread runtime/executor/method.cpp
Comment thread runtime/executor/method.cpp Outdated
Comment thread runtime/executor/method.cpp Outdated
Comment on lines +1692 to +1694
". Possible infinite loop detected.",
kMaxInstructions,
step_state_.chain_idx);
Copy link

Copilot AI Apr 20, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The log message doesn’t tell users how to remediate legitimate long-running models that exceed the default limit. Consider including the current instruction index (step_state_.instr_idx) and a hint that the limit is configurable via -DET_MAX_INSTRUCTIONS (or ET_MAX_INSTRUCTIONS) so operators can adjust it without digging into the code.

Suggested change
". Possible infinite loop detected.",
kMaxInstructions,
step_state_.chain_idx);
" at instruction %" ET_PRIsize_t
". Possible infinite loop detected. For legitimate long-running "
"models, increase the limit via -DET_MAX_INSTRUCTIONS "
"(ET_MAX_INSTRUCTIONS).",
kMaxInstructions,
step_state_.chain_idx,
step_state_.instr_idx);

Copilot uses AI. Check for mistakes.
Comment on lines +1687 to +1697
if (instruction_count >= kMaxInstructions) {
ET_LOG(
Error,
"Instruction execution limit (%" ET_PRIsize_t
") exceeded in chain %" ET_PRIsize_t
". Possible infinite loop detected.",
kMaxInstructions,
step_state_.chain_idx);
step_state_ = StepState{0, 0};
return Error::InvalidState;
}
Copy link

Copilot AI Apr 20, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This introduces a new failure mode in Method::execute() that should be covered by a regression test. Consider adding a unit test that builds a minimal flatbuffer program with a self-looping jump (or otherwise forces an infinite instruction loop) and asserts execute() fails once the instruction limit is exceeded (ideally with a small ET_MAX_INSTRUCTIONS value for the test target to keep runtime short).

Copilot uses AI. Check for mistakes.
@lucylq
Copy link
Copy Markdown
Contributor Author

lucylq commented Apr 20, 2026

@claude review

@claude
Copy link
Copy Markdown

claude Bot commented Apr 20, 2026

Claude finished @lucylq's task in 1m 12s —— View job


Review of PR #18679

  • Read the diff and changed files
  • Read surrounding code for context
  • Analyze the change and provide review feedback

Good defensive measure. The change is small and well-scoped. A few observations:

1. Error code: InvalidState vs InvalidProgram

The current code returns Error::InvalidState, but this condition is triggered by a malformed program (self-looping jump instruction), not by the runtime being in an invalid state. Error::InvalidProgram (0x23, "Error caused by the contents of a program") is a better semantic fit. This also avoids confusion with the existing InvalidState return at method.cpp:1657 which guards against calling execute() while already in progress — callers may want to distinguish these two failure modes.

Fix this →

2. Negative/zero ET_MAX_INSTRUCTIONS is silently accepted

Since kMaxInstructions is size_t (unsigned), passing -DET_MAX_INSTRUCTIONS=0 would disable the guard entirely, and -DET_MAX_INSTRUCTIONS=-1 would wrap to SIZE_MAX, also effectively disabling it. A static_assert would catch misconfigurations at build time:

static_assert(ET_MAX_INSTRUCTIONS > 0, "ET_MAX_INSTRUCTIONS must be positive");

This is a minor point — arguably someone setting it to 0 intends to "disable" it — but if the intent is that the limit is always active, a static assert is worth adding.

3. Log message could include instruction index and remediation hint

The current log shows the limit and chain index, but not step_state_.instr_idx. Including the instruction index helps identify exactly which instruction is looping. A brief mention of -DET_MAX_INSTRUCTIONS in the log would also help users who hit the limit with legitimate large models.

4. step() is not guarded

The step() method (method.cpp:1589) executes one instruction per call and has no instruction counting. Since step() is caller-controlled (the caller decides when to stop calling it), this is fine — the caller is already in the loop and can enforce their own limits. Noting this for completeness; no action needed.

5. No dedicated test for the new failure mode

The test plan references existing method_test, but there's no new test that constructs a program with a self-looping jump and asserts the limit is hit. A test with a small ET_MAX_INSTRUCTIONS value (e.g., via a separate test target with -DET_MAX_INSTRUCTIONS=100) would provide regression coverage for this specific path and confirm the error code returned.


Summary: The change is sound and addresses a real denial-of-service vector from malformed programs. The two actionable suggestions are (1) using Error::InvalidProgram instead of Error::InvalidState for clearer error semantics, and (2) adding a static_assert to reject non-positive limit values at compile time. The rest are minor improvements.

@lucylq lucylq changed the title Add instruction execution limit to prevent infinite loops (TOB-EXECUT… Add instruction execution limit to prevent infinite loops Apr 20, 2026
…ORCH-23)

JumpFalseCall instructions can set destination_instruction to themselves,
creating an infinite loop that hangs the runtime. This adds a configurable
instruction counter (default 10M, overridable via -DET_MAX_INSTRUCTIONS) to
Method::execute() that returns Error::InvalidState if exceeded.

This PR was authored with the assistance of Claude.
@lucylq lucylq merged commit c4983ce into main Apr 21, 2026
165 of 168 checks passed
@lucylq lucylq deleted the security23 branch April 21, 2026 05:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants